treatment level
Debiased Machine Learning for Conformal Prediction of Counterfactual Outcomes Under Runtime Confounding
Barnatchez, Keith, Josey, Kevin P., Nethery, Rachel C., Parmigiani, Giovanni
Data-driven decision making frequently relies on predicting counterfactual outcomes. In practice, researchers commonly train counterfactual prediction models on a source dataset to inform decisions on a possibly separate target population. Conformal prediction has arisen as a popular method for producing assumption-lean prediction intervals for counterfactual outcomes that would arise under different treatment decisions in the target population of interest. However, existing methods require that every confounding factor of the treatment-outcome relationship used for training on the source data is additionally measured in the target population, risking miscoverage if important confounders are unmeasured in the target population. In this paper, we introduce a computationally efficient debiased machine learning framework that allows for valid prediction intervals when only a subset of confounders is measured in the target population, a common challenge referred to as runtime confounding. Grounded in semiparametric efficiency theory, we show the resulting prediction intervals achieve desired coverage rates with faster convergence compared to standard methods. Through numerous synthetic and semi-synthetic experiments, we demonstrate the utility of our proposed method.
Causal Discovery and Inference towards Urban Elements and Associated Factors
Feng, Tao, Zhang, Yunke, Fan, Xiaochen, Wang, Huandong, Li, Yong
To uncover the city's fundamental functioning mechanisms, it is important to acquire a deep understanding of complicated relationships among citizens, location, and mobility behaviors. Previous research studies have applied direct correlation analysis to investigate such relationships. Nevertheless, due to the ubiquitous confounding effects, empirical correlation analysis may not accurately reflect underlying causal relationships among basic urban elements. In this paper, we propose a novel urban causal computing framework to comprehensively explore causalities and confounding effects among a variety of factors across different types of urban elements. In particular, we design a reinforcement learning algorithm to discover the potential causal graph, which depicts the causal relations between urban factors. The causal graph further serves as the guidance for estimating causal effects between pair-wise urban factors by propensity score matching. After removing the confounding effects from correlations, we leverage significance levels of causal effects in downstream urban mobility prediction tasks. Experimental studies on open-source urban datasets show that the discovered causal graph demonstrates a hierarchical structure, where citizens affect locations, and they both cause changes in urban mobility behaviors. Experimental results in urban mobility prediction tasks further show that the proposed method can effectively reduce confounding effects and enhance performance of urban computing tasks.
Integrating Active Learning in Causal Inference with Interference: A Novel Approach in Online Experiments
Zhu, Hongtao, Zhang, Sizhe, Su, Yang, Zhao, Zhenyu, Chen, Nan
In the domain of causal inference research, the prevalent potential outcomes framework, notably the Rubin Causal Model (RCM), often overlooks individual interference and assumes independent treatment effects. This assumption, however, is frequently misaligned with the intricate realities of real-world scenarios, where interference is not merely a possibility but a common occurrence. Our research endeavors to address this discrepancy by focusing on the estimation of direct and spillover treatment effects under two assumptions: (1) network-based interference, where treatments on neighbors within connected networks affect one's outcomes, and (2) non-random treatment assignments influenced by confounders. To improve the efficiency of estimating potentially complex effects functions, we introduce an novel active learning approach: Active Learning in Causal Inference with Interference (ACI). This approach uses Gaussian process to flexibly model the direct and spillover treatment effects as a function of a continuous measure of neighbors' treatment assignment. The ACI framework sequentially identifies the experimental settings that demand further data. It further optimizes the treatment assignments under the network interference structure using genetic algorithms to achieve efficient learning outcome. By applying our method to simulation data and a Tencent game dataset, we demonstrate its feasibility in achieving accurate effects estimations with reduced data requirements. This ACI approach marks a significant advancement in the realm of data efficiency for causal inference, offering a robust and efficient alternative to traditional methodologies, particularly in scenarios characterized by complex interference patterns.
DeepAISE -- An End-to-End Development and Deployment of a Recurrent Neural Survival Model for Early Prediction of Sepsis
Shashikumar, Supreeth P., Josef, Christopher, Sharma, Ashish, Nemati, Shamim
Abstract: Sepsis, a dysregulated immune system response to infection, is among the leading causes of morbidity, mortality, and cost overruns in the Intensive Care Unit (ICU). Ear ly prediction of sepsis can improve situational awareness amongst clinicians and facilitate timely, protective interventions. While the application of predictive analytics in ICU patients has shown early promising results, much of the work has been encumbe red by high false - alarm rates. Efforts to improve specificity have been limited by several factors, most notably the difficulty of labeling sepsis onset time and the low prevalence of septic - events in the ICU. We show that by coupling a clinical criterion for defining sepsis onset time with a treatment policy (e.g., initiation of antibiotics within one hour of meeting the criterion), one may rank the relative utility of various criteria through offline policy evaluation. Given the optimal criterion, DeepAISE automatically learns predictive features related to higher - order interactions and temporal patterns among clinic al risk factors that maximize the data likelihood of observed time to septic events. DeepAISE has been incorporated into a clinical workflow, which provides real - time hourly sepsis risk scores. A comparative study of four baseline models indicates that Dee pAISE produces the most accurate predictions (AUC 0.90 and 0.87) and the lowest false alarm rates (FAR 0.20 and 0.26) in two separate cohorts (internal and external, respectively), while simultaneously producing interpretable representations of the clinica l time series and risk factors. Introduction Sepsis is a syndromic, life - threatening condition that arises when the body's response to infection injures its own internal organs (1) . Though the condition lacks the same public notoriety as other conditions like heart attacks, 6% of all hospitalized patients in the U nited S tates carry a primary diagnosis of sepsis as compared to 2.5% for the latter (2) . When all hospital deaths are ultimately considered, nearly 35% are attributable to sepsis (2) . This condition stands in stark contrast to heart attacks which have a mortality rate of 2.7 - 9.6% and only cost the US $12.1 billion ann ually, roughly half of the cost of sepsis (3) .